BIASED METROPOLIS-HEAT-BATH ALGORITHM FOR 
FUNDAMENTAL-ADJOINT SU(2) LATTICE GAUGE THEORY 
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For SU(2) lattice gauge theory with the fundamental-adjoint action an efficient heat-bath algorithm 
is not known so that one had to rely on Metropolis simulations supplemented by overrelaxation. 
Implementing a novel biased Metropolis-heat-bath algorithm for this model, we find improvement 
factors in the range 1.45 to 2.06 over conventionally optimized Metropolis simulations. If one 
optimizes further with respect to additional overrelaxation sweeps, the improvement factors are 
found in the range 1.3 to 1.8. 
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I. INTRODUCTION 

Biased Metropolis Algorithms (BMAs) have been in- 
troduced quite some time ago [1], but they have not been 
applied beyond isolated classes of problems. Instead, the 
most frequently used Monte Carlo schemes are the (orig- 
inal) Metropolis Algorithm (MA) and the Heat-Bath Al- 
gorithm (HBA), see [2] for a textbook discussion. Both 
algorithms perform local updates of random variables, 
which, in lattice gauge theory, are matrices on the links 
of a 4D hypercubic lattice. 

In its vanilla form, for lattice gauge theories, the MA 
proposes matrices with the Haar measure of the gauge 
group. This suffers often from low acceptance rates, but 
can be improved by restricting the proposal range to a 
neighborhood of the matrix already in place. However, 
one should keep in mind that too small changes are not 
good either. A low acceptance rate as well as too small 
changes lead to long autocorrelation times. As a general 
rule one should not tune up the acceptance rate to more 
than 30% to 50% of the proposed updates (see, e.g., [2]). 

A way to improve the acceptance rate without restrict- 
ing the proposals to a small range, and paying the price 
of large autocorrelation times, is to perform multiple hits 
on the same matrix. As each hit, apart from some com- 
mon overhead, increases the CPU time needed linearly, 
an optimum is normally reached for a fairly small number 
of hits. 

If one neglects CPU time requirements and counts only 
the number of link-updates the HBA achieves optimal 
performance in this class of local algorithms. By invert- 
ing the relevant cumulative distribution function it deliv- 
ers the same results as a multi-hit Metropolis algorithm 
in the limit of an infinite number of hits per link update. 
This works very well in some cases, but in others the in- 
version is numerically so slow that, including CPU time 
in the balance sheet, a Metropolis scheme stays far more 
efficient than the HBA (which for many models has not 
even been constructed). 



In a previous paper [3] two of the present authors have 
shown that the MA can be biased so that it becomes 
an excellent approximation of the heat-bath updating, 
which was first introduced by Creutz [4] for SU(2) lattice 
gauge theory. The Biased Metropolis-Heat-bath Algo- 
rithm (BMHA) was illustrated for SU(2) and U(l) lat- 
tice gauge theories and the performance was found com- 
petitive with the best implementations of the heat-bath 
algorithm [5-8] for these models. In the present note we 
work out an example for which an efficient implemen- 
tation of the conventional heat-bath algorithm does not 
exist: SU(2) lattice gauge theory with the fundamental- 
adjoint action. 



II. THE MODEL 

The SU(2) fundamental-adjoint action is 

= f ER'=Tr(C/n)-f ^5](ReTr(t/n))^ . 

□ □ 

(1) 

Here Un ~ Ui^j^Uj^i^Ui^j^Uj^i^^ where the sum is over 
all plaquettes of a 4D simple hypercubic Nt lattice, 
and ii, ji, 12, J2 label the sites circulating around the 
plaquette and Uji is the SU(2) matrix associated with 
the link {ij). The reversed link is associated with the 
inverse matrix. 

This model has a bulk phase transition [9] along lines 
in the (/3/, /?„) plane, see [10] and references given therein 
for more detailed investigations of this transition. Fig. 1, 
extracted from Ref. [9,10], shows the location of the bulk 
transition together with the iVt = 4 deconfining phase 
transition line. While our aim is exclusively to improve 
the MC algorithm, we target the phase transition lines, 
interesting from the physics point of view and hence likely 
places for future simulations, when choosing coupling 
constant values for our test runs. Our test simulation 
points are also indicated in the figure. 
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FIG. 1. Phase diagram of SU(2) lattice gauge theory with 
the fundamental-adjoint action. The solid lines arc the bulk 
transition and the dotted line indicates the Nt = A deconfining 
transition. The five coupling constant values of our test runs 
are indicated by + signs. 



Our parameterization for SU(2) matrices is 



U — ao I + id ■ d, 



d^ = 1, 



(2) 



where I denotes the 2x2 identity matrix and a are the 
PauU matrices. A property of SU(2) group elements is 
that any sum of them is proportional to another SU(2) 
element. We define a SU(2) matrix Uu which corresponds 
to a sum of the staples in equation (1) by 



k=l 



\ 



det j ^ Uu,k 



(3) 



yk=l 



Here, Uu,k, k = 1, ..,6, denote the products of the three 
link matrices which, together with U, the link to be up- 
dated, form one of the six plaquettes containing the link 
to be updated. The main step for implementing a BMHA 
is the table building process. Here we proceed in two 
steps. First, wc use only the fundamental part and get 
the same table for the update variable ao and the pa- 
rameter Su as in [3]. This leads already to an increase 
of the Acceptance Rate (AR) by a factor of nearly ten 
compared with the full-range MA. We refer to this ap- 
proximation, which uses only the fundamental part of the 
action for table building, as BMHA-fund. As outlined in 
Ref. [3] such an updating table influences the efficiency 
through the AR, while the corresponding BMA is as ex- 
act as the usual MA. In Ref. [11] an essentially equivalent 
algorithm was proposed and used for the fundamental- 
adjoint SU(3) theory: do a Cabibbo-Marinari heatbath 
trial with the fundamental part of the gauge action and 
the accept or reject with a Metropolis step using the ad- 
joint part of the action. 

In a second step we construct our final BMHA by in- 
cluding the adjoint part of the action in a crude approxi- 



mation, which is technically easy to handle and sufficient 
to increase the AR further. The amount of the increase 
in AR is dependent on the two couplings (fundamental 
and adjoint). At the critical endpoint of the bulk first 
order transition line, for example, it is another 20% over 
that of the BMHA-fund to about 85%. 

To trace multiplicative factors clearly, we consider in 
the following the action of our theory in rf-dimensions. 
At each link wc have 2(d — 1) terms contributing to the 
□ sum. The link variables are SU(2) matrices in the fun- 
damental representation. A new link variable is proposed 
according to 



(4) 



where Ur is randomly chosen with the appropriate mea- 
sure and Uu is a normalized staple matrix. 

For constructing our BMHA we replace in the adjoint 
part each individual staple Uu,i by 



Uu,i 



UuA 



1 



2{d-l) 



Uu. 



(5) 



This means, for the table we neglect individual staples 
fluctuations in the adjoint part. Instead of the adjoint 
part of the action we use 



2(d-l) 



|i ^ (ReTr([/suf/u,,)) . 



(6) 



Using (4) and (5) this reduces to 

2(d-l) 
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Su 



2{d-l) 



(7) 



Nothing depends on the index of summation now, so the 
sum reduces to 2{d—l). Also, as before ReTr(t/'^) = 2ao, 
and we get for the adjoint contribution 



3 2{d-l) 



(8) 



The total expression for the probability density which we 
tabulate is 



P(ao) ~ \Jl-a^ exp Su ao + 



13a ^alsl 



3 2{d-l) 



(9) 



It has only one variable ao and one parameter Su- As 
in [3] we define a = /3/Su for programming convenience. 
This substitution leads here to 



P{ao) 



1 — an exp a oo + o 
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(10) 
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TABLE I. Simulation at (/3/,/3a) = (1-5, 0.9) on a 4* lattice 
relying on a statistics of 1000 sweeps for reaching equilibrium 
and 32 x 1000 swoops with measurements. Autocorrelation 
times are in units of MC sweeps. 





Metropolis 


BMHA-fund 


BMHA 




0.3451 (15) 


0.34636 (52) 


0.34694 (62) 


{{US)} 


0.6368 (15) 


0.63798 (47) 


0.63853 (56) 


nr.t{{Ui)) 


100.2 (8.6) 


19.5 (1.7) 


19.8 (2.5) 


r,„t((C/S» 


95.9 (8.0) 


17.1 (1.4) 


16.5 (2.2) 


AR in % 


6.5 (2) 


62.4 (4) 


85.2 (3) 



TABLE IL Relative efficiencies for our simulations on 4x8"'* 
lattices. In columns 3-5 the efficiency of the BMHA over 
the 5-hit MA is shown for 0, 1 and 2 overrelaxation sweeps 
(plain, lo and 2o). Columns 6 and 7 show how the BMHA is 
improved (or not) by additional overrelaxation hits. 



(ft , /9a) 


AR 


plain 


lo 


2o 


lobmha 


2obmha 


(1.5,0.9) 


0.84 


2.06 


1.53 


1.42 


1.23 


1.10 


(1.83,0.5) 


0.90 


1.76 


1.45 


1.38 


1.41 


1.37 


(1.2146, 1.25) 


0.79 


1.80 


1.74 


1.15 


0.93 


0.69 


(1.2,1.25) 


0.70 


1.46 


1.27 


1.23 


0.93 


0.74 


(1.23, 1.25) 


0.83 


1.50 


1.31 


1.28 


1.02 


0.84 



III. NUMERICAL RESULTS 

To give an idea of how update proposals with a dis- 
cretization of the probability density (10) work, we col- 
lect in table I the results of a short simulation on a 4^ 
lattice at (/3/,/3q) = (1.5,0.9). The observables are the 
plaquette expectation values in the fundamental and ad- 
joint representation, ({/□) and (f/S), respectively, and 
their integrated autocorrelation times, Tint{{Un)) and 
Tint{{Uu))- Here (•) denotes the average over a lattice 
configuration and ((•)) in table I the mean from all lat- 
tice configurations. Autocorrelation times and error bars 
are calculated as explained in [2]. While the estimated 
expectation values agree within statistical fluctuations, 
we find a dramatic increase of the AR from 6.5% for the 
plain (full-range) MA to 62.4% for the BMHA-fund and 
85.2% for the BMHA. This is accompanied by a decrease 
of the Tint values, which is obvious for the first improve- 
ment step and within the limited statistics hardly visible 
for the second step (although certainly true due to the 
higher acceptance rate). 

Our main goal is to evaluate the BMHA against a 
MC algorithm, which was previously tuned by one of 
the authors for optimal performance [12]. This is a n-hit 
Metropolis algorithm, with update proposals by multi- 
plying the old link matrix with an SU(2) matrix cen- 
tered around the unit element with a spread dynamically 
adjusted to give an acceptance rate of about 50% per 
Metropolis hit. Doing 5 hits was found most cost effec- 



TABLE HL Simulation at (/3/,/3a) = (1.2146,1.25) on a 

4x8^ lattice relying on a statistics of 2^'* = 16384 swoops 
for equilibration and 32 x 20480 sweeps with measurements. 
The CPU times are given in seconds. All other quantities are 
given in units of sweeps. 





Tint{{Ui,)) 


Tint{{US)) 


Ti„t{{L)) 


tunneling 


tcpu 


5h 


2294 (253) 


2262 (253) 


3430 (337) 


11.4 (1.2) 10'' 


18390 


lo5h 


1718 (137) 


1692 (136) 


2487 (238) 


5 900 (420) 


25522 


2o5h 


1209 (117) 


1193 (119) 


2131 (258) 


4 667 (340) 


32292 


bm 


1804 (155) 


1776 (153) 


2981 (270) 


8 323 (670) 


12958 


lobm 


1222 (111) 


1204 (110) 


2083 (299) 


5 190 (320) 


20573 


2obm 


1172 (079) 


1156 (078) 


1746 (173) 


4 520 (240) 


28295 
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FIG. 2. Probability density for the BMHA run (without 
overrelaxation) of table HI, x = {Ua)- 

tive. For simulations on 4 x 8^ lattices we compare the 
5-hit MA with our BMHA implementation for several 
(/3/, /?a) parameter values in the proximity of the bulk as 
well as the deconfining transition as shown in Fig. 1 and 
compiled in table H. The AR of the BMHA is listed in 
column 2 of table H. Depending on the coupling con- 
stant values it varies in the range from 70% to 90%. At 
all coupling constant values we have checked that the 
averages of our measured operators do, up to statistical 
fluctuation, not depend on the updating method. 

Including so-called overrelaxation sweeps [13-16] is 
known to reduce autocorrelation times, when the correla- 
tion length becomes large, for example close to a second 
order phase transition. For the fundamental-adjoint ac- 
tion, an exact overrelaxation step is not known to us. 
Instead we make a trial overrelaxation update with the 
fundamental part of the action and accept or reject the 
update according to the change in the adjoint part of 
the action. The acceptance rate for these overrelaxation 
sweeps decreases as the adjoint coupling becomes larger 
compared to the fundamental coupling. For the couplings 
considered here the acceptance rate for the overrelaxation 
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sweeps varied between 69% and 91%. 

In the subsequent tables, algorithms are encoded in the 
following way: 5h corresponds to the 5-hit Metropolis 
algorithm with the AR tuned to 50% per hit, bm to the 
BMHA, io, with i = 1, 2 to doing 1 or 2 overrelaxation 
sweeps after each MA or BMHA update. 

For {(3f,(3a) = (1.2146, 1.25) some of our data are com- 
piled in table III. This coupling constant point is pretty 
much on top of the bulk P*-order transition line, which 
leads to a double peak structure of the probability den- 
sity of many observables, as illustrated in Fig. 2 for the 
plaquette expectation value in the fundamental repre- 
sentation. Autocorrelation times of Polyakov loops (L) 
and tunneling times are also compiled in table III. Here 
the "tunneling time" is defined as the average number 
of sweeps the Markov process needs to propagate from 
one of the two maxima to the other and back. For all 
observables presented in table HI we see that switching 
from the 5-hit MA to the BMA reduces not only the inte- 
grated autocorrelation and tunneling times, but also the 
CPU times. 

The efficiency of an algorithmic approach 1 with re- 
spect to an algorithmic approach 2 is given by 

_g(l,2) ^ T{2)int t{2)cPU 

T{l)int t{l)cPU ' ^ ' 

This formula reflects that the algorithm which needs less 
CPU time and produces a smaller value for the integrated 
autocorrelation time is the more efficient one. The T(i)int 
values, and hence the efficiencies, depend somewhat on 
the operator chosen. Using Tint{{Un)) column 3 of ta- 
ble II collects the efficiencies found when comparing the 
BMHA with the 5-hit MA at our coupling constant val- 
ues, as given in column 1. Enhancements in the range 
1.46 to 2.06 are found. Using other operators gives some- 
what higher or lower efficiencies, but no systematic trend 
in either direction. For all operators we find always an 
improvement of the BMHA over the 5-hit MA. 

The values of column 3 of table II are reduced by in- 
cluding overrelaxation sweeps in both the 5-hit MA and 
the BMHA as is seen in columns 4 and 5. This comes 
because the overrelaxation sweeps add uniformly CPU 
time in both cases for which the integrated autocorrela- 
tion times get reduced by more or less the same fraction 
in case of the 5-hit MA as well as for the BMHA. In all 
cases the numbers in columns 4 and 5 of table II stay 
larger than 1, which means that the BMHA delivers al- 
ways the better performance. 

The last point is to consider whether the increase of 
CPU time for including overrelaxation sweeps in the 
BMHA is justified by the achieved decrease of integrated 
autocorrelation times or not. This is done by calculating 
the efficiency of the BMHA with one or two overrelax- 
ation sweeps with respect to the plain BMHA. The re- 
sults are given in the last two columns of table II. We 
see that in two cases the performance with overrelaxation 
sweeps is worse (numbers < 1) than for the plain BMHA. 



For another case there is almost no change, and in the 
two remaining cases one overrelaxation sweep (lo) be- 
fore each BMHA sweeps is best. The points for which 
the overrelaxation sweeps help are close to the decon- 
fining transition, where the correlation length is large 
and overrelaxation sweeps arc expected to be efficient, 
whereas the other three points are close to the bulk tran- 
sition. 

In conclusion, while the need for overrelaxation sweeps 
varies, the BMHA outperforms the 5-hit MA always. 
Once constructed the BMHA does (in contrast to the 5- 
hit MA) not need any fine-tuning of parameters, so that 
it then provides a straightforward approach to perform- 
ing pure lattice gauge theory simulations eflaciently. 
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